An Approach towards Construction and Application of Multilingual Indo-WordNet

Authors

  • Manish Sinha
  • Mahesh Reddy
  • Pushpak Bhattacharyya
Abstract

In the work reported here, we address three related issues. 1. We present an effective method for constructing the Marathi WordNet (http://www.cfilt.iitb.ac.in/wordnet/webmwn/) from the Hindi WordNet (http://www.cfilt.iitb.ac.in/wordnet/webhwn/), both of which are being developed at IIT Bombay. Henceforth we refer to them as MWN and HWN, respectively. 2. Synset identity is the key to connecting WordNets. 3. We present an interface for browsing the linked Hindi and Marathi WordNets (the Bilingual WordNet) simultaneously for a given word in either Hindi or Marathi. As an application, we present Word Sense Disambiguation (WSD) of nouns in Hindi. The system has been evaluated on corpora provided by the Central Institute of Indian Languages (http://www.ciil.org/), and the results are encouraging.

Similar resources

An Approach towards Applying and Constructing Multilingual Indo-WordNet

In the work reported here, we present three important related issues. 1. We present an effective method of construction of the Marathi WordNet (http://www.cfilt.iitb.ac.in/wordnet/webmwn/) using the Hindi WordNet (http://www.cfilt.iitb.ac.in/wordnet/webhwn/), both of which are being developed at IIT Bombay. Henceforth we will refer to them as MWN and HWN respectively. 2. The Synset identity i...

Expansion of the First Hindi-Nepali Word-Net Based Bi-Lingual Dictionary and the advancement of the Human-Machine Interface

Natural Language Processing is introducing a new era in the field of Computer Science and Machine Translation. Human-Machine interaction is set to play a very important role in the coming centuries as human dependence on machines keeps increasing. Word-Net was first introduced by Miller and Fellbaum in 1985. WordNet is a Lexical database for the Human Languages. It groups the Human...

IndoWordNet

India is a multilingual country where machine translation and cross-lingual search are highly relevant problems. These problems require large resources, like wordnets and lexicons, of high quality and coverage. Wordnets are lexical structures composed of synsets and semantic relations. Synsets are sets of synonyms. They are linked by semantic relations like hypernymy (is-a), meronymy (part-of), tro...

Unsupervised Construction of a Multilingual WordNet from Parallel Corpora

This paper outlines an approach to the unsupervised construction from unannotated parallel corpora of a lexical semantic resource akin to WordNet. The paper also describes how this resource can be used to add lexical semantic tags to the text corpus at hand. Finally, we discuss the possibility to add some of the predicates typical for WordNet to its automatically constructed multilingual versio...

Identifying Concepts Across Languages: A First Step towards a Corpus-based Approach to Automatic Ontology Alignment

The growing importance of multilingual information retrieval and machine translation has made multilingual ontologies an extremely valuable resource. Since the construction of an ontology from scratch is a very expensive and time consuming undertaking, it is attractive to consider ways of automatically aligning monolingual ontologies, which already exist for many of the world’s major languages....


Publication date: 2005